Record Linkage Using Stata: Preprocessing, Linking, and Reviewing Utilities
Similar resources
Efficient Record Linkage Algorithms Using Complete Linkage Clustering.
Different agencies often hold data on the same individuals. Linking these datasets to identify all the records belonging to the same individuals is a crucial and challenging problem, especially given the large volumes of data. Many of the available record linkage algorithms suffer from either time inefficiency or low accuracy in finding matches and non-matches among the records...
Record Linkage
Record linkage, in the present context, is simply the bringing together of information from two records that are believed to relate to the same entity, for example, the same individual, the same family, or the same business. This might involve the linking of records within a single database to identify duplicate case records. Alternatively, record linkage might involve the linking of records across t...
Privacy-preserving record linkage using Bloom filters
BACKGROUND: Combining multiple databases with disjunctive or additional information on the same person is occurring increasingly throughout research. If unique identification numbers for these individuals are not available, probabilistic record linkage is used for the identification of matching record pairs. In many applications, identifiers have to be encrypted due to privacy concerns. METHOD...
Using Structured Neural Networks for Record Linkage
We report on our continuing work on pedigree-based record linkage. In particular, we show how a structured neural network can be designed to learn weights across pieces of information and how the inherent skewness of the data can be reduced by filtering, or blocking, through a series of these networks. The results, both quantitative and qualitative, are encouraging.
Journal
Journal title: The Stata Journal: Promoting communications on statistics and Stata
Year: 2015
ISSN: 1536-867X,1536-8734
DOI: 10.1177/1536867x1501500304